Iceberg: Enhancing HLS Modeling with Synthetic Data
Ding, Zijian, Nguyen, Tung, Li, Weikai, Grover, Aditya, Sun, Yizhou, Cong, Jason
Deep learning-based prediction models for High-Level Synthesis (HLS) of hardware designs often struggle to generalize. In this paper, we study how to close the generalizability gap of these models through pretraining on synthetic data and introduce Iceberg, a synthetic data augmentation approach that expands both large language model (LLM)-generated programs and weak labels of unseen design configurations. Our weak label generation method is integrated with an in-context model architecture, enabling meta-learning from actual and proximate labels. Iceberg improves the geometric mean modeling accuracy by $86.4\%$ when adapted to six real-world applications with few-shot examples and achieves $2.47\times$ and $1.12\times$ better offline DSE performance when adapting to two different test datasets. Our open-source code is available at https://github.com/UCLA-VAST/iceberg
Domain Adaptive Skin Lesion Classification via Conformal Ensemble of Vision Transformers
Zoravar, Mehran, Alijani, Shadi, Najjaran, Homayoun
Exploring the trustworthiness of deep learning models is crucial, especially in critical domains such as medical imaging decision support systems. Conformal prediction has emerged as a rigorous means of providing deep learning models with reliable uncertainty estimates and safety guarantees. However, conformal prediction results face challenges when the backbone model struggles in domain-shifted scenarios, such as variations across data sources. To address this challenge, this paper proposes a novel framework termed Conformal Ensemble of Vision Transformers (CE-ViTs), designed to enhance image classification performance by prioritizing domain adaptation and model robustness while accounting for uncertainty. The proposed method leverages an ensemble of vision transformer models in the backbone, trained on diverse datasets including the HAM10000, Dermofit, and Skin Cancer ISIC datasets. This ensemble learning approach, calibrated on the combined datasets, aims to enhance domain adaptation through conformal learning. Experimental results underscore that the framework achieves a high coverage rate of 90.38\%, an improvement of 9.95\% over the HAM10000 model, indicating a strong likelihood that the prediction set includes the true label compared to singular models. Ensemble learning in CE-ViTs significantly improves conformal prediction performance, increasing the average prediction set size for challenging misclassified samples from 1.86 to 3.075.
- North America > Canada > British Columbia > Vancouver Island > Capital Regional District > Victoria (0.05)
- Asia > Middle East > Jordan (0.04)
- Health & Medicine > Therapeutic Area > Dermatology (0.95)
- Health & Medicine > Therapeutic Area > Oncology > Skin Cancer (0.38)
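As a rough illustration of the conformal prediction step described in the CE-ViTs abstract above, the sketch below builds split-conformal prediction sets from a classifier's softmax outputs. The function name, the choice of nonconformity score (one minus the true-class probability), and the synthetic calibration data are illustrative assumptions, not the paper's actual implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

def conformal_sets(cal_probs, cal_labels, test_probs, alpha=0.1):
    """Split-conformal prediction sets for a classifier (sketch).

    cal_probs:  (n, k) softmax outputs on a held-out calibration set
    cal_labels: (n,)   true labels for the calibration set
    test_probs: (m, k) softmax outputs on test points
    Returns one array of candidate labels per test point.
    """
    n = len(cal_labels)
    # Nonconformity score: 1 - probability assigned to the true class.
    scores = 1.0 - cal_probs[np.arange(n), cal_labels]
    # Finite-sample-corrected (1 - alpha) quantile of the scores.
    q = np.quantile(scores, np.ceil((n + 1) * (1 - alpha)) / n)
    # Keep every class whose score falls at or below the threshold.
    return [np.where(1.0 - p <= q)[0] for p in test_probs]
```

With the target miscoverage `alpha=0.1`, sets built this way cover the true label roughly 90% of the time on exchangeable data, which is the kind of coverage guarantee the abstract reports.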
Arabic Tweet Act: A Weighted Ensemble Pre-Trained Transformer Model for Classifying Arabic Speech Acts on Twitter
Alshehri, Khadejaa, Alhothali, Areej, Alowidi, Nahed
Speech acts are a speaker's actions when performing an utterance within a conversation, such as asking, recommending, greeting, or thanking someone, expressing a thought, or making a suggestion. Understanding speech acts helps interpret the intended meaning and actions behind a speaker's or writer's words. This paper proposes a Twitter dialectal Arabic speech act classification approach based on a transformer deep learning neural network. Twitter and social media are becoming more and more integrated into daily life. As a result, they have evolved into a vital source of information that represents the views and attitudes of their users. We propose a BERT-based weighted ensemble learning approach to integrate the advantages of various BERT models in dialectal Arabic speech act classification. We compared the proposed model against several variants of Arabic BERT models and sequence-based models. We developed a dialectal Arabic tweet act dataset by annotating a subset of a large existing Arabic sentiment analysis dataset (ASAD) based on six speech act categories. We also evaluated the models on a previously developed Arabic tweet act dataset (ArSAS). To overcome the class imbalance issue commonly observed in speech act problems, a transformer-based data augmentation model was implemented to generate an equal proportion of speech act categories. The results show that the best BERT model is araBERTv2-Twitter, with a macro-averaged F1 score of 0.73 and an accuracy of 0.84. The performance improved with a BERT-based ensemble method, reaching an averaged F1 score of 0.74 and an accuracy of 0.85 on our dataset.
- North America > United States > Massachusetts (0.04)
- Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)
- Europe > Ukraine > Kyiv Oblast > Kyiv (0.04)
- Asia > Singapore (0.04)
- Information Technology > Communications > Social Media (1.00)
- Information Technology > Artificial Intelligence > Natural Language (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
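The weighted ensemble idea in the abstract above — combining several BERT models' class probabilities with per-model weights — can be sketched generically as weighted soft voting. The function name, the normalization scheme, and the toy probabilities are illustrative assumptions, not the paper's exact method.

```python
import numpy as np

def weighted_ensemble(prob_list, weights):
    """Weighted soft voting over per-model class probabilities (sketch).

    prob_list: list of (n, k) arrays, one softmax output matrix per model
    weights:   one weight per model, e.g. its validation F1 score
    Returns (predicted labels, combined probabilities).
    """
    w = np.asarray(weights, dtype=float)
    w = w / w.sum()                              # normalize the weights
    stacked = np.stack(prob_list)                # (models, n, k)
    combined = np.tensordot(w, stacked, axes=1)  # weighted average, (n, k)
    return combined.argmax(axis=1), combined
```

A stronger model can thus outvote a weaker one: with weights 0.9 and 0.1, a confident prediction from the first model dominates the combined distribution.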
Efficacy of Machine-Generated Instructions
Gulati, Samaksh, Verma, Anshit, Parmar, Manoj, Chaudhary, Palash
Large "instruction-tuned" language models (i.e., finetuned to respond to instructions) have demonstrated a remarkable ability to generalize zero-shot to new tasks. Nevertheless, they depend heavily on human-written instruction data that is often limited in quantity, diversity, and creativity, therefore hindering the generality of the tuned model. We conducted a quantitative study to assess the efficacy of machine-generated annotations, comparing the results of a fine-tuned BERT model with human- versus machine-generated annotations. Applying our methods to the vanilla GPT-3 model, we saw that machine-generated annotations were 78.54% correct and the fine-tuned model achieved 96.01%. This result shows that machine-generated annotations are a resource- and cost-effective way to fine-tune downstream models.
Confidence Is All You Need for MI Attacks
Sinha, Abhishek, Tibrewal, Himanshi, Gupta, Mansi, Waghela, Nikhar, Garg, Shivank
In this evolving era of machine learning security, membership inference attacks have emerged as a potent threat to the confidentiality of sensitive data. In this attack, adversaries aim to determine whether a particular point was used during the training of a target model. This paper proposes a new method to gauge a data point's membership in a model's training set. Instead of correlating loss with membership, as is traditionally done, we leverage the fact that training examples generally exhibit higher confidence values when classified into their actual class. During training, the model is essentially being 'fit' to the training data and might face particular difficulties in generalizing to unseen data. This asymmetry leads the model to achieve higher confidence on the training data as it exploits the specific patterns and noise present there. Our proposed approach leverages the confidence values generated by the machine learning model. These confidence values provide a probabilistic measure of the model's certainty in its predictions and can further be used to infer the membership of a given data point. Additionally, we introduce a variant of our method that carries out this attack without knowing the ground truth (true class) of a given data point, thus offering an edge over existing label-dependent attack methods.
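The label-free variant described in the abstract above can be sketched very simply: flag a point as a training-set member when the model's top softmax confidence exceeds a threshold. The function name, the thresholding rule, and the toy probabilities are illustrative assumptions, not the paper's exact attack.

```python
import numpy as np

def confidence_attack(probs, threshold):
    """Predict training-set membership from top softmax confidence (sketch).

    probs: (n, k) softmax outputs the target model returns for queried points
    Points classified with confidence at or above `threshold` are flagged
    as members; note that no ground-truth label is needed.
    """
    confidence = probs.max(axis=1)   # probability of the predicted class
    return confidence >= threshold
```

In practice the threshold would be tuned, e.g. on shadow models, so that overconfident predictions (memorized training points) are separated from the flatter distributions typical of unseen data.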
Explainable Patterns for Distinction and Prediction of Moral Judgement on Reddit
Efstathiadis, Ion Stagkos, Paulino-Passos, Guilherme, Toni, Francesca
The r/AmITheAsshole forum on Reddit hosts discussions of moral issues based on concrete narratives presented by users. Existing analysis of the forum focuses on its comments and does not make the underlying data publicly available. In this paper we build a new dataset of comments and also investigate the classification of the posts in the forum. Further, we identify textual patterns associated with the provocation of moral judgement by posts, with the expression of moral stance in comments, and with the decisions of trained classifiers of posts and comments.
- Europe > United Kingdom > England > Greater London > London (0.05)
- Asia > India (0.04)
Introduction to PyTorch
Recently, Microsoft and PyTorch announced a "PyTorch Fundamentals" tutorial, which you can find on Microsoft's site and on PyTorch's site. The code in this post is based on the code appearing in that tutorial, and forms the foundation for a series of other posts, where I'll explore other machine learning frameworks and show integration with Azure ML. In this post, I'll explain how you can create a basic neural network in PyTorch, using the Fashion MNIST dataset as a data source. The neural network we'll build takes as input images of clothing, and classifies them according to their contents, such as "Shirt," "Coat," or "Dress." I'll assume that you have a basic conceptual understanding of neural networks, and that you're comfortable with Python, but I assume no knowledge of PyTorch. Let's start by getting familiar with the data we'll be using, the Fashion MNIST dataset. This dataset contains 70,000 grayscale images of articles of clothing -- 60,000 meant to be used for training and 10,000 meant for testing.
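Before diving into the data loading code, here is a minimal sketch of the kind of network we'll be building: a small fully connected classifier that flattens a 28x28 grayscale image and produces one logit per clothing class. The class name and layer sizes here are my own choices, not necessarily those used in the tutorial.

```python
import torch
from torch import nn

class FashionClassifier(nn.Module):
    """A small fully connected network for 28x28 Fashion MNIST images."""

    def __init__(self):
        super().__init__()
        self.flatten = nn.Flatten()          # (batch, 28, 28) -> (batch, 784)
        self.layers = nn.Sequential(
            nn.Linear(28 * 28, 512),
            nn.ReLU(),
            nn.Linear(512, 10),              # one logit per clothing class
        )

    def forward(self, x):
        return self.layers(self.flatten(x))

model = FashionClassifier()
logits = model(torch.rand(1, 28, 28))        # one fake grayscale image
```

Passing a random tensor through the untrained model is a handy sanity check: the output shape `(1, 10)` confirms the layers are wired up correctly before any real training happens.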
Loss Functions
Today is a new day, a day of adventure and mountain climbing! So like the good student you are, you attended today's class but didn't understand:( Luckily, you got me, your personal professor. I asked your classmates about today's class and they told me that the professor taught you about Loss Functions, some even told me that he taught them how to climb down from different mountains. Well, grab your hiking gear and follow my lead, we are going to climb down from a high mountain, higher than Everest itself. Do you remember that the objective of training the neural network is to try to minimize the loss between the predictions and the actual values?
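Before we start our descent, let's make the metaphor concrete with a tiny example: the loss is the height of the mountain, and each gradient step moves us downhill. The sketch below uses mean squared error and a one-parameter toy model y = w * x (my own toy setup, not something from today's lecture).

```python
import numpy as np

def mse(predictions, targets):
    """Mean squared error: the 'height of the mountain' we climb down."""
    return np.mean((predictions - targets) ** 2)

# Toy data generated by y = 2 * x, so the ideal parameter is w = 2.
x = np.array([1.0, 2.0, 3.0])
y = np.array([2.0, 4.0, 6.0])

w = 0.0                                     # start at the top of the mountain
for _ in range(50):
    grad = np.mean(2 * (w * x - y) * x)     # d(MSE)/dw for this model
    w -= 0.1 * grad                         # one small step downhill
```

After a few dozen steps, `w` has slid down to roughly 2 and the loss is nearly zero — exactly the "minimize the loss between the predictions and the actual values" objective from class.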